NSF PAR Search | NSF Public Access Repository

Temporal Difference Learning with Compressed Updates: Error-Feedback meets Reinforcement Learning

Mitra, Aritra; Pappas, George; Hassani, Hamed (April 2024, Transactions on Machine Learning Research)

Full Text Available
Finite-Time Analysis of On-Policy Heterogeneous Federated Reinforcement Learning

Zhang, Chenyu; Wang, Han; Mitra, Aritra; Anderson, James (January 2024, The Twelfth International Conference on Learning Representations (ICLR))

Federated reinforcement learning (FRL) has emerged as a promising paradigm for reducing the sample complexity of reinforcement learning tasks by exploiting information from different agents. However, when each agent interacts with a potentially different environment, little to nothing is known theoretically about the non-asymptotic performance of FRL algorithms. The lack of such results can be attributed to various technical challenges and their intricate interplay: Markovian sampling, linear function approximation, multiple local updates to save communication, heterogeneity in the reward functions and transition kernels of the agents’ MDPs, and continuous state-action spaces. Moreover, in the on-policy setting, the behavior policies vary with time, further complicating the analysis. In response, we introduce FedSARSA, a novel federated on-policy reinforcement learning scheme, equipped with linear function approximation, to address these challenges and provide a comprehensive finite-time error analysis. Notably, we establish that FedSARSA converges to a policy that is near-optimal for all agents, with the extent of near-optimality proportional to the level of heterogeneity. Furthermore, we prove that FedSARSA leverages agent collaboration to enable linear speedups as the number of agents increases, which holds for both fixed and adaptive step-size configurations.
Full Text Available
Stochastic Approximation with Delayed Updates: Finite-Time Rates under Markovian Sampling

Adibi, Arman; DalFabbro, Nicolo; Schenato, Luca; Kulkarni, Sanjeev; Poor, Vincent; Pappas, George; Hassani, Hamed; Mitra, Aritra (May 2024, PMLR)

Full Text Available
Graph-theoretic approaches for analyzing the resilience of distributed control systems: A tutorial and survey

https://doi.org/10.1016/j.automatica.2023.111264

Pirani, Mohammad; Mitra, Aritra; Sundaram, Shreyas (November 2023, Automatica)

Full Text Available
Linear Stochastic Bandits over a Bit-Constrained Channel

Mitra, Aritra; Hassani, Hamed; Pappas, George (June 2023, Learning for Dynamics and Control)

One of the primary challenges in large-scale distributed learning stems from stringent communication constraints. While several recent works address this challenge for static optimization problems, sequential decision-making under uncertainty has remained much less explored in this regard. Motivated by this gap, we introduce a new linear stochastic bandit formulation over a bit-constrained channel. Specifically, in our setup, an agent interacting with an environment transmits encoded estimates of an unknown model parameter to a server over a communication channel of finite capacity. The goal of the server is to take actions based on these estimates to minimize cumulative regret. To this end, we develop a novel and general algorithmic framework that hinges on two main components: (i) an adaptive encoding mechanism that exploits statistical concentration bounds, and (ii) a decision-making principle based on confidence sets that account for encoding errors. As our main result, we prove that when the unknown model is d-dimensional, a channel capacity of O(d) bits suffices to achieve order-optimal regret. We also establish that for the simpler unstructured multi-armed bandit problem, 1 bit channel capacity is sufficient for achieving optimal regret bounds.
Full Text Available
Linear stochastic bandits over a bit-constrained channel

Mitra, Aritra; Hassani, Hamed; Pappas, George (June 2023, Learning for Dynamics and Control Conference)

One of the primary challenges in large-scale distributed learning stems from stringent communication constraints. While several recent works address this challenge for static optimization problems, sequential decision-making under uncertainty has remained much less explored in this regard. Motivated by this gap, we introduce a new linear stochastic bandit formulation over a bit-constrained channel. Specifically, in our setup, an agent interacting with an environment transmits encoded estimates of an unknown model parameter to a server over a communication channel of finite capacity. The goal of the server is to take actions based on these estimates to minimize cumulative regret. To this end, we develop a novel and general algorithmic framework that hinges on two main components: (i) an adaptive encoding mechanism that exploits statistical concentration bounds, and (ii) a decision-making principle based on confidence sets that account for encoding errors. As our main result, we prove that when the unknown model is d-dimensional, a channel capacity of O(d) bits suffices to achieve order-optimal regret. We also establish that for the simpler unstructured multi-armed bandit problem, 1 bit channel capacity is sufficient for achieving optimal regret bounds.
Linear Stochastic Bandits over a Bit-Constrained Channel

Mitra, Aritra; Hassani, Hamed; Pappas, George (May 2023, Learning for Dynamics and Control Conference)

One of the primary challenges in large-scale distributed learning stems from stringent communication constraints. While several recent works address this challenge for static optimization problems, sequential decision-making under uncertainty has remained much less explored in this regard. Motivated by this gap, we introduce a new linear stochastic bandit formulation over a bit-constrained channel. Specifically, in our setup, an agent interacting with an environment transmits encoded estimates of an unknown model parameter to a server over a communication channel of finite capacity. The goal of the server is to take actions based on these estimates to minimize cumulative regret. To this end, we develop a novel and general algorithmic framework that hinges on two main components: (i) an adaptive encoding mechanism that exploits statistical concentration bounds, and (ii) a decision-making principle based on confidence sets that account for encoding errors. As our main result, we prove that when the unknown model is d-dimensional, a channel capacity of O(d) bits suffices to achieve order-optimal regret. We also establish that for the simpler unstructured multi-armed bandit problem, 1 bit channel capacity is sufficient for achieving optimal regret bounds. Keywords: Linear Bandits, Distributed Learning, Communication Constraints
On the computational complexity of the secure state-reconstruction problem

https://doi.org/10.1016/j.automatica.2021.110083

Mao, Yanwen; Mitra, Aritra; Sundaram, Shreyas; Tabuada, Paulo (February 2022, Automatica)

Full Text Available
Distributed State Estimation over Time-Varying Graphs: Exploiting the Age-of-Information

https://doi.org/10.1109/TAC.2021.3130882

Mitra, Aritra; Richards, John; Bagchi, Saurabh; Sundaram, Shreyas (November 2021, IEEE Transactions on Automatic Control)

Full Text Available
A New Approach to Distributed Hypothesis Testing and Non-Bayesian Learning: Improved Learning Rate and Byzantine Resilience

https://doi.org/10.1109/TAC.2020.3033126

Mitra, Aritra; Richards, John A.; Sundaram, Shreyas (September 2021, IEEE Transactions on Automatic Control)

Full Text Available

